Classification based extraction of numeric values from clinical narratives
Abstract
The robust extraction of numeric values from clinical narratives is a well-known problem in clinical data warehouses. In this paper we describe a dynamic and domain-independent approach to extracting numerically described values from clinical narratives. In contrast to alternative systems, we use neither manually defined rules nor any kind of ontology or nomenclature. Instead we propose a topic-based system that treats information extraction as a text classification problem. We use machine learning to identify the crucial context features of a topic-specific numeric value from a given set of example sentences, so that the manual effort is reduced to selecting appropriate sample sentences. We describe the context features of a numeric value by term frequency vectors, which are generated by multiple document segmentation procedures. Because these segmentation procedures are applied simultaneously, there can be more than one context vector for a numeric value. In those cases, we choose the context vector with the highest classification confidence and suppress the rest. To test our approach, we used a dataset from a German hospital containing 12,743 narrative reports about laboratory results of leukemia patients. We used Support Vector Machines (SVM) for classification and achieved an average accuracy of 96% on a manually labeled subset of 2,073 documents, using 10-fold cross-validation. This is a significant improvement over an alternative rule-based system.
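The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two segmentation procedures (`window_context`, `sentence_context`), the hand-set `WEIGHTS` and `BIAS`, and the sample text are all assumptions made for illustration, and a plain linear scorer stands in for the trained SVM. The sketch builds one term frequency vector per segmentation for each number, scores each vector, and keeps only the highest-scoring one.

```python
import re
from collections import Counter

NUMBER = re.compile(r"\d+(?:[.,]\d+)?")
SENTENCE_END = re.compile(r"[.!?]\s+")

def tokenize(text):
    """Lowercase alphabetic tokens; digits and punctuation are dropped."""
    return re.findall(r"[a-z]+", text.lower())

def window_context(text, match, width=3):
    """Term frequencies of up to `width` tokens on each side of the number."""
    left = tokenize(text[:match.start()])[-width:]
    right = tokenize(text[match.end():])[:width]
    return Counter(left + right)

def sentence_context(text, match):
    """Term frequencies of the sentence that contains the number."""
    start = 0
    for boundary in SENTENCE_END.finditer(text):
        if boundary.end() <= match.start():
            start = boundary.end()          # boundary lies before the number
        elif boundary.start() >= match.end():
            return Counter(tokenize(text[start:boundary.start()]))
    return Counter(tokenize(text[start:]))  # number is in the last sentence

# Hypothetical segmentation procedures; the paper does not list its own here.
SEGMENTERS = {"window": window_context, "sentence": sentence_context}

# Hypothetical hand-set weights for the topic "leukocyte count".
# In the paper these would be learned by an SVM from example sentences.
WEIGHTS = {"leukocytes": 2.0, "measured": 0.5, "blood": 0.5,
           "day": -2.5, "follow": -1.0, "weeks": -1.5}
BIAS = -0.5

def score(tf):
    """Linear decision value; larger means more confident for the topic."""
    return BIAS + sum(WEIGHTS.get(term, 0.0) * n for term, n in tf.items())

def extract(text):
    """Return (number, winning segmenter, score) for numbers accepted as on-topic."""
    results = []
    for match in NUMBER.finditer(text):
        candidates = [(score(seg(text, match)), name)
                      for name, seg in SEGMENTERS.items()]
        # Keep the most confident context vector and suppress the rest.
        best_score, best_name = max(candidates, key=lambda c: c[0])
        if best_score > 0:
            results.append((match.group(), best_name, best_score))
    return results

TEXT = ("Patient was seen on day 14. Leukocytes were measured at 4.5 "
        "in the peripheral blood. Follow-up in 3 weeks.")
print(extract(TEXT))  # -> [('4.5', 'sentence', 2.5)]
```

In this toy example the window around 4.5 misses the word "leukocytes", so the sentence-level vector scores higher and is the one kept, while 14 and 3 are rejected under both segmentations.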
Similar resources
Developing a New Method in Object Based Classification to Updating Large Scale Maps with Emphasis on Building Feature
Given the expansion of cities, updating urban maps for urban planning is important, and its effectiveness depends on the accuracy of information extraction and change detection. Information extraction methods fall into two groups: Pixel-Based (PB) and Object-Based (OB). OB analysis has overcome the limitations of PB analysis (producing salt-pepper results and features with hole...
Extracting Clinical Relationships from Patient Narratives
The Clinical E-Science Framework (CLEF) project has built a system to extract clinically significant information from the textual component of medical records, for clinical research, evidence-based healthcare and genotype-meets-phenotype informatics. One part of this system is the identification of relationships between clinically important entities in the text. Typical approaches to relationsh...
Clinical Relationships Extraction Techniques from Patient Narratives
The Clinical E-Science Framework (CLEF) project was used to extract important information from medical texts by building a system for the purpose of clinical research, evidence-based healthcare and genotype-meets-phenotype informatics. The system is divided into two parts, one of which concerns the identification of relationships between clinically important entities in the text. The full pars...
A Unique Approach of Noise Elimination from Electroencephalography Signals between Normal and Meditation State
In this paper, a unique approach is presented for electroencephalography (EEG) signal analysis. It is based on the eigenvalue distribution of a matrix called the scaled Hankel matrix. This gives a way to find the number of eigenvalues essential for noise reduction and signal extraction in singular spectrum analysis. This paper gives us an approach to classify the EEG signa...
MedTime: A temporal information extraction system for clinical narratives
Temporal information extraction from clinical narratives is of critical importance to many clinical applications. We participated in the EVENT/TIMEX3 track of the 2012 i2b2 clinical temporal relations challenge, and presented our temporal information extraction system, MedTime. MedTime comprises a cascade of rule-based and machine-learning pattern recognition procedures. It achieved a micro-ave...
Publication date: 2017